Abstract

We propose both spatial and non-spatial stochastic models for pathogen dynamics in the 
presence of an immune response. One of our spatial models shows that, at least in theory, a 
pathogen may escape the immune system thanks to its high mutation probability alone. While 
one of our non-spatial models also exhibits this behavior, another behaves quite differently 
from the corresponding spatial model. 


1 Introduction and results 

We study some simple mathematical models designed to test the following hypothesis: can a 
pathogen escape the immune system only because of its high probability of mutation? We 
propose both spatial and non-spatial models. In all of our models, we assume that pathogens can 
mutate, leading to the appearance of new types of pathogens. We also assume that the immune 
system is able to get rid of all the pathogens of a given type at once but that it recognizes only 
one type at a time. 

1.1 Non-spatial models 

For our non-spatial models, we start with a single pathogen at time zero. Each pathogen gives
birth to a new pathogen at rate $\lambda$. When a new pathogen is born, it has the same type as its
parent with probability $1 - r$. With probability $r$, a mutation occurs, and the new pathogen has a
different type from all previously observed pathogens. For convenience, we say that the pathogen
present at time zero has type 1, and the $k$th type to appear will be called type $k$. Note that we
assume the birth rate $\lambda$ to be the same for all types and we therefore ignore selection pressures.

If there are no deaths, then this is a model of a Yule process with infinitely many types, 
which goes back to Yule (1925). The model with no deaths was studied recently by Durrett and 
Schweinsberg (2005), who focused on the joint distribution of the number of pathogens of each 
type. Here we assume that the response of the immune system can eliminate pathogens of a 
given type. We propose three different models for the behavior of the immune system. In all
three models the pathogens give birth and mutate as described above. Each model corresponds 
to a different immune response. 

Model 1: At times of a rate 1 Poisson process, a death event occurs. When there is a death
event, if there are $k$ types of pathogens alive, then one of the types is chosen at random, each
with probability $1/k$, and all pathogens of that type are simultaneously killed.

Model 2: When a new type appears in the population, it survives for an exponential amount 
of time with mean 1, independently of all the other types. All pathogens of the type are killed 
simultaneously. 

Model 3: Each pathogen is born with a mean 1 exponential clock. When the clock goes off the 
pathogen is killed as well as all the pathogens of the same type. 

To understand the models better, note that if there are $k$ types and $N$ total pathogens, then
the total rate of death events is 1 in Model 1, $k$ in Model 2, and $N$ in Model 3. Also, if there are
$n_i$ pathogens of type $i$, the rate at which type $i$ is being killed is $1/k$ in Model 1, 1 in Model 2,
and $n_i$ in Model 3. Thus, in Models 1 and 2, the rate at which a type is killed does not depend
on the number of pathogens with that type, but in Model 3, types that have large numbers of
pathogens are more likely to be targeted by the immune system and eliminated. Model 2 is
similar to random graph models studied by Chung and Lu (2004) and Cooper, Frieze, and Vera
(2004) with preferential attachment (corresponding to births) and vertex deletion (corresponding
to deaths).
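The rates listed above can be turned into a small event-driven simulation. The following is a minimal sketch, not part of the original analysis: the function name `simulate`, the population cap `max_pop`, and the time horizon `t_max` are illustrative assumptions introduced only to keep a run finite.

```python
import random


def simulate(model, lam, r, t_max, max_pop=10_000, seed=None):
    """Run one sample path of non-spatial Model 1, 2 or 3.

    Returns the list of pathogen counts per living type when the run stops.
    The run stops when no pathogen is left, when time exceeds t_max, or when
    the population reaches max_pop (a hypothetical cap, not in the paper).
    """
    rng = random.Random(seed)
    counts = [1]  # counts[i] = number of pathogens of the i-th living type
    t = 0.0
    while counts:
        n_total = sum(counts)
        k = len(counts)
        if n_total >= max_pop:
            break
        birth_rate = lam * n_total
        # Total death-event rate: 1 in Model 1, k in Model 2, N in Model 3.
        death_rate = {1: 1.0, 2: float(k), 3: float(n_total)}[model]
        total_rate = birth_rate + death_rate
        t += rng.expovariate(total_rate)
        if t > t_max:
            break
        if rng.random() * total_rate < birth_rate:
            if rng.random() < r:
                counts.append(1)  # mutation: a brand-new type appears
            else:
                # The parent is uniform among pathogens, so the child's type
                # is chosen with probability proportional to the type size.
                i = rng.choices(range(k), weights=counts)[0]
                counts[i] += 1
        else:
            if model == 3:
                # Type i is killed at rate n_i.
                i = rng.choices(range(k), weights=counts)[0]
            else:
                # Models 1 and 2: every living type is equally likely to die.
                i = rng.randrange(k)
            counts.pop(i)
    return counts
```

Calling, for instance, `simulate(3, lam=4.0, r=0.5, t_max=10.0, seed=0)` many times with different seeds gives a rough empirical picture of how often the population dies out before the cap is reached; such runs only illustrate the models and prove nothing about survival for all time.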

With all three models, there is a positive probability that the immune system will succeed in
eliminating the pathogens, as the first pathogen could die before it has any offspring. Our main
result for the non-spatial models is the following theorem, which specifies the values of $r$ and $\lambda$
for which there is a positive probability that the pathogens survive, meaning that for all $t > 0$,
there is at least one pathogen alive at time $t$.

Theorem 1. Assume $\lambda > 0$ and $r > 0$.

1. For Model 1, the pathogens survive with positive probability.

2. For Model 2, the pathogens survive with positive probability if and only if $\lambda > 1$.

3. For Model 3, the pathogens survive with positive probability if and only if $r > 1/\lambda$.

Thus, the three models produce very different behavior. In Model 1, which is the model in
which the death rates are lowest, any positive probability of mutation is enough to allow the
pathogen to escape the immune system. For Model 2, whether or not the pathogen can survive
depends only on the reproduction rate and not on the mutation rate. For Model 3, there is a
phase transition, in that for fixed $\lambda$, the pathogens can escape the immune system if $r > 1/\lambda$ but
not if $r < 1/\lambda$. The proof of Theorem 1 is given in Section 2.

1.2 Spatial models

We now introduce three spatial models that correspond to the three non-spatial models above.
Each spatial stochastic model is on the lattice $\mathbb{Z}^d$, where the dimension $d$ can be any positive
integer. Every site of $\mathbb{Z}^d$ is either occupied by a pathogen or empty. Each model is started with
a single pathogen at the origin of $\mathbb{Z}^d$ and with all other sites empty.

Rules for births and mutations are the same for the three spatial models. Let $x$ be a site
occupied by a pathogen and $y$ be one of its $2d$ nearest neighbors. After a random exponential
time with rate $\lambda$, the pathogen on $x$ gives birth on $y$, provided $y$ is empty (if $y$ is occupied
nothing happens). With probability $1 - r$ the new pathogen on $y$ is of the same type as the
parent pathogen on $x$. With probability $r$ the new pathogen is of a different type. We assume
that every new type that appears is different from all types that have ever appeared.

In addition to the birth and mutation rates described above, the three spatial models, which
we call S1, S2 and S3, have the same rules for the immune responses as non-spatial Models 1,
2 and 3, respectively. We start with a result for Model S1. The result shows that this model
produces the same behavior as the corresponding non-spatial model.

Theorem 2. Consider Model S1 on $\mathbb{Z}^d$ for $d \ge 1$. For every $\lambda > 0$ and $r > 0$, the pathogens
have a positive probability of surviving.

We now turn to Models S2 and S3. If $r = 1$, then every birth gives rise to a new type in
Models S2 and S3. Since all pathogens are of different types there is only one death at a time
in both models. If we ignore the types, the process of occupied sites is the well-known contact
process. The contact process has a critical value $\lambda_c$ which depends on the dimension $d$ of the
lattice. If $\lambda < \lambda_c$ the pathogens die out, while if $\lambda > \lambda_c$ there is a positive probability that
pathogens will survive forever. For more on the contact process, see Liggett (1999).

Theorem 3.

1. Consider Model S2 with $\lambda < 1/(2d)$. For all $r$ in $[0,1]$, the pathogens die out with probability 1.

2. Consider Model S3 with $\lambda < \lambda_c$. For all $r$ in $[0,1]$, the pathogens die out with probability 1.

Theorem 4. Consider Models S2 and S3 on $\mathbb{Z}^d$ for $d \ge 1$ with parameters $\lambda$ and $r$.

1. For any $\lambda > \lambda_c$, there is an $r_1$ in $(0,1)$ such that if $r < r_1$, then the pathogens die out.

2. For any $\lambda > \lambda_c$, there is an $r_2$ in $(0,1)$ such that if $r > r_2$, then the pathogens survive with
positive probability.

We conjecture that for both Model S2 and S3 there is a critical value $r_c$ such that the pathogens
die out if $r < r_c$ and survive if $r > r_c$. This would follow from our results if we could prove, for
instance, that the probability of pathogen survival is increasing in $r$. However, it is not clear that
this is a true statement.

While Models 3 and S3 behave in a similar way, Models 2 and S2 are strikingly different. In
particular, Model S2 exhibits a phase transition in $r$ (survival of pathogens for large $r$, death for
small $r$) and Model 2 does not.

Theorems 3 and 4 will be proved in Section 3, and Theorem 2 will be proved in Section 4.

1.3 Discussion

In this paper we propose a new class of models for immune response. Our main assumption is 
that the immune system is able to get rid of all the pathogens of a given type at once. As far as 
we know these are the first models that attempt to mimic the “central command” nature of the 
immune system with global killing rules. There is increasing evidence that the immune system 
has a coordinated global behavior. See in particular Silvestri and Feinberg (2003) who argue that 
the pathogenesis of AIDS is caused by chronic immune activation rather than direct attack of 
the HIV virus. In contrast, most of the existing models are predator-prey differential equation
models with local killing rules; see De Boer and Perelson (1998), Iwasa et al. (2004) and Nowak
and May (2000). There also exist a number of spatial models, in particular cellular automata,
that model HIV infections; see Perelson and Weisbuch (1997), Bernaschi and Castiglione (2002) or
Zorzenon (1999). However, these models, unlike ours, are quite complex, use local killing rules
and are not analyzed rigorously.

Our original motivation for introducing our models is to test the following hypothesis:
can the immune system be overwhelmed by a particular virus only because of its high
probability of mutation? Ordinary differential equation models have been used to test this
hypothesis. In particular, Nowak and May (2000) (see Sections 12.1 and 12.2) introduce models of
increasing complexity to get a behavior similar to the behavior exhibited by our simple Models
3, S2 and S3 (pathogens die out for small mutation rate and survive for large mutation rate).
Sasaki (1994) uses a partial differential equation model in which all types of pathogens have the 
same reproduction rate. However, his analysis yields results strikingly different from ours. In 
particular, he finds that the pathogens may survive only if the mutation rate is intermediate. If 
the mutation rate is too low or too high, the pathogens die out in his model; see in particular his 
results for the infinite allele model. 

The rest of the paper is devoted to proofs. We are able to give short proofs for all our results
except for Theorem 2 when $d = 1$. While this is probably not the most biologically significant
model, we feel that our mathematical analysis is worthwhile. The proof that an interacting particle
system survives is almost always done by coupling the system to a much simpler one. This is the
case in this paper for all spatial models except for Model S1 in $d = 1$, for which we could not find
such a coupling. Instead, to prove that the pathogens survive we do a “pathwise” analysis of our
model. This is a rather delicate analysis, but it yields a lot more information about the behavior
of the system than the coupling technique does.

2 Analysis of the non-spatial models 

Theorem 1 can be proved using standard branching process techniques. We begin with the
analysis of Model 1, which can be carried out using a comparison with a birth-death process.

Proof of part 1 of Theorem 1. Let $X(t)$ be the number of different types of pathogens alive at
time $t$. Thus, $X(0) = 1$, and we need to show that for all $r > 0$, we have $P(X(t) > 0 \text{ for all } t) > 0$.
Since $r > 0$, we can choose $N$ such that $N\lambda r > 1$. Let $T_0 = \inf\{t : X(t) = N\}$. With positive
probability, the pathogen present at time zero gives birth to pathogens of $N$ new types before
the first death event. Therefore, $P(T_0 < \infty) > 0$. For positive integers $n$, inductively define, on
the event that $T_{n-1} < \infty$, the stopping time $T_n = \inf\{t > T_{n-1} : X(t) \ne X(T_{n-1})\}$. That is, $T_n$
is the first time after $T_{n-1}$ that either a pathogen of a new mutant type is born, in which case
$X(T_n) = X(T_{n-1}) + 1$, or one of the types is eliminated, in which case $X(T_n) = X(T_{n-1}) - 1$.

Since there must be at least one pathogen of each type, and each pathogen gives birth to
pathogens of new types at rate $\lambda r$, at time $t$ the rate at which new types are being born is at
least $X(t)\lambda r$. Therefore, whenever $X(t) \ge N$, the rate at which new types are being born is
at least $N\lambda r$. The rate of death events, which cause a type to be eliminated, is always 1. Let
$p = N\lambda r/(1 + N\lambda r) > 1/2$. Then for all $k \ge N$, we have $P(X(T_n) = k+1 \mid X(T_{n-1}) = k) \ge p$.
Now, consider a birth-death process $(Y_n)_{n=0}^{\infty}$ such that $Y_0 = N$ and, for all integers $k$, we have
$P(Y_n = k+1 \mid Y_{n-1} = k) = p$ and $P(Y_n = k-1 \mid Y_{n-1} = k) = 1 - p$. It is well-known, see for instance
Hoel, Port and Stone (1972), p. 32, that $P(Y_n \ge N \text{ for all } n) = (2p - 1)/p > 0$. Therefore, by
comparing the processes $(X(T_n))_{n=0}^{\infty}$ and $(Y_n)_{n=0}^{\infty}$, we see that
$P(X(T_n) \ge N \text{ for all } n \mid T_0 < \infty) \ge (2p - 1)/p$. On the event that $T_0 < \infty$ and $X(T_n) \ge N$ for all $n$,
we have $X(t) > 0$ for all $t$. Therefore, the probability that the pathogens survive is at least
$P(T_0 < \infty)(2p - 1)/p > 0$. □
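The escape probability $(2p - 1)/p$ used above can be checked numerically. The sketch below is illustrative only: the helper names `up_probability` and `escape_probability` and the truncation level `height` are assumptions introduced here, and the computation uses the classical gambler's ruin recursion rather than anything specific to the paper.

```python
def up_probability(n_types, lam, r):
    """p = N*lam*r / (1 + N*lam*r): lower bound on the chance that the next
    change in the number of types is a gain, once there are at least N types."""
    rate = n_types * lam * r
    return rate / (1.0 + rate)


def escape_probability(p, height=500):
    """Probability that a walk with up-probability p, started one step above
    an absorbing barrier, climbs `height` levels above the barrier before
    hitting it.

    With h(0) = 0 and h(height) = 1, the harmonic equation
    h(i) = p*h(i+1) + (1-p)*h(i-1) gives increments
    h(i+1) - h(i) = rho**i * h(1), where rho = (1-p)/p, hence
    h(1) = 1 / sum(rho**j for j < height).
    """
    rho = (1.0 - p) / p
    return 1.0 / sum(rho ** j for j in range(height))


p = up_probability(3, lam=1.0, r=0.5)  # N*lam*r = 1.5, so p = 0.6
print(p, escape_probability(p), (2 * p - 1) / p)
```

For $p > 1/2$ the truncated value approaches $1 - (1-p)/p = (2p-1)/p$ as `height` grows, which is the quantity quoted from Hoel, Port and Stone (1972).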


To study Models 2 and 3, we will construct a tree which keeps track of the genealogy of the 
different types of pathogens. Each vertex in the tree will be labeled by a positive integer. There 
will be a vertex labeled k if and only if a pathogen of type k is born at some time. We draw a 
directed edge from j to k if the first pathogen of type k to be born had a pathogen of type j as its 
parent. This construction gives a tree whose root is labeled 1 because all types of pathogens are 
descended from the pathogen of type 1 that is present at time zero. Since every type is eliminated 
eventually, we have X(t) >0 for all t if and only if infinitely many different types of pathogens 
eventually appear or, in other words, if and only if the tree described above has infinitely many 
vertices. 

For Model 1, the rate at which a given type is killed depends on the number of other types 
present. However, for Models 2 and 3, the rate at which a type is killed is either constant in the 
case of Model 2, or depends only on the number of pathogens of the type in the case of Model 3. 
Consequently, once the first pathogen of type k is born, the number of mutant offspring born to 
type k pathogens is independent of how the other types evolve. Therefore, the tree constructed 
above is a Galton-Watson tree, and the process survives with positive probability if and only if 
the mean of the offspring distribution is greater than one, see for instance 1.9 in Schinazi (1999). 
This observation can be used to prove parts 2 and 3 of Theorem 1. We begin with part 3, which
is simpler.


Proof of part 3 of Theorem 1. Whenever there are $n$ pathogens of a given type, the event in
which the type is destroyed is happening at rate $n$, while events in which pathogens of the type
give birth to offspring of new mutant types are happening at rate $nr\lambda$. Therefore, regardless of
the number of pathogens of the given type, the probability that the type is destroyed before the
next birth to a mutant type is $n/(n + nr\lambda) = 1/(1 + r\lambda)$, and the probability that an individual
gives birth to a mutant offspring before the type is destroyed is $r\lambda/(1 + r\lambda)$. Suppose $X$ is the
number of types that are offspring of a given type. Then for $k \ge 0$,

$$P(X = k) = \left(\frac{r\lambda}{1 + r\lambda}\right)^{k} \frac{1}{1 + r\lambda} = \frac{(r\lambda)^k}{(1 + r\lambda)^{k+1}}.$$

That is, $X + 1$ has the geometric distribution with parameter $1/(1 + r\lambda)$. It follows that the
mean of the offspring distribution, which equals $r\lambda$, is greater than one if and only if $r > 1/\lambda$.
As discussed above, this is the condition for the process to survive with positive probability. □
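As a numerical companion to this proof, the extinction probability of the Galton-Watson tree of types can be approximated by iterating the offspring generating function $f(s) = q/(1 - ps)$, with $p = r\lambda/(1 + r\lambda)$ and $q = 1/(1 + r\lambda)$, starting from $s = 0$. This is a sketch under the standard fixed-point characterisation of extinction probabilities; the function names and the iteration count are assumptions made for illustration, and the closed form $\min(1, 1/(r\lambda))$ mentioned below is a consequence of that standard theory rather than a statement taken from the text.

```python
def offspring_mean(lam, r):
    """Mean number of new types descended from one type in Model 3."""
    return r * lam


def extinction_probability(lam, r, iterations=2000):
    """Smallest fixed point of f(s) = q / (1 - p*s), found by iterating from 0.

    Here p = r*lam / (1 + r*lam) is the chance of one more mutant birth
    before the type is destroyed, and q = 1 - p.
    """
    p = r * lam / (1.0 + r * lam)
    q = 1.0 / (1.0 + r * lam)
    s = 0.0
    for _ in range(iterations):
        s = q / (1.0 - p * s)
    return s


for lam, r in [(4.0, 0.5), (1.0, 0.5)]:
    print(lam, r, offspring_mean(lam, r), extinction_probability(lam, r))
```

Solving $s = q/(1 - ps)$ gives the roots $s = 1$ and $s = q/p = 1/(r\lambda)$, so the iteration should settle near $\min(1, 1/(r\lambda))$ away from the critical case $r\lambda = 1$, where convergence is slow.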
Proof of part 2 of Theorem 1. If $r = 1$, then there is only one individual of each type, so we have
births at rate $\lambda$ and deaths of a single individual at rate 1. In this case, the result is a standard
fact about branching processes. Now suppose $r < 1$. Let $m'$ be the expected number of type 1 pathogens
which are offspring of the initial pathogen. Note that the type 1 pathogens evolve like a Yule
process with births at rate $\lambda(1 - r)$ until the type 1 pathogens all die at time $T$, which has an
exponential distribution with mean one. If $Y(t)$ denotes the number of type 1 pathogens at time
$t$, then conditioning on the value of $T$ gives

$$m' + 1 = E[Y(T)] = \int_0^\infty e^{-t} E[Y(t)] \, dt = \int_0^\infty e^{-t} e^{\lambda(1-r)t} \, dt = \int_0^\infty e^{-(1 - \lambda(1-r))t} \, dt.$$

It follows that $m' = \infty$ if $\lambda(1 - r) \ge 1$ and $m' = \lambda(1 - r)/(1 - \lambda(1 - r))$ if $\lambda(1 - r) < 1$.

Now, let $m$ be the mean number of different types that are offspring of type 1 pathogens.
Because each type 1 pathogen gives birth to new types at rate $r\lambda$ and to other type 1 pathogens
at rate $(1 - r)\lambda$, we must have $m = rm'/(1 - r)$. Therefore, $m = \infty$ if $\lambda(1 - r) \ge 1$ and
$m = r\lambda/(1 - \lambda(1 - r))$ if $\lambda(1 - r) < 1$. It follows that $m > 1$ if and only if $\lambda > 1$. Thus, the
process survives with positive probability if and only if $\lambda > 1$. □
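The final step, that $m > 1$ exactly when $\lambda > 1$, can be spot-checked on a grid of parameter values. The snippet below is only a sanity check of the formula for $m$ derived in the proof; the function name `mean_new_types` and the particular grid are illustrative assumptions.

```python
import math


def mean_new_types(lam, r):
    """m = r*lam / (1 - lam*(1 - r)) when lam*(1 - r) < 1, and infinity otherwise."""
    a = lam * (1.0 - r)
    if a >= 1.0:
        return math.inf
    return r * lam / (1.0 - a)


# Grid chosen to avoid the boundary lam = 1, where m = 1 only up to rounding.
for lam in (0.5, 0.9, 1.1, 2.0, 5.0):
    for r in (0.1, 0.5, 0.9, 1.0):
        m = mean_new_types(lam, r)
        print(f"lam={lam}, r={r}, m={m:.3f}, supercritical={m > 1}")
```

Algebraically, $r\lambda > 1 - \lambda(1 - r)$ rearranges to $\lambda > 1$, and $\lambda(1 - r) \ge 1$ with $r > 0$ already forces $\lambda > 1$, which is what the grid reflects.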

3 Analysis of the second and third spatial models 

In this section, we prove Theorems 3 and 4, which pertain to the spatial Models S2 and S3.

Proof of part 1 of Theorem 3. Suppose $\lambda < 1/(2d)$. We may couple Model S2, in which an
individual gives birth on each of the $2d$ neighboring sites at rate $\lambda$, with Model 2, in which each
individual gives birth at rate $\lambda' = 2d\lambda$. In both models, each type dies at rate 1, and each
pathogen gives birth at rate $2d\lambda$. However, a birth that occurs in Model 2 will be suppressed in
Model S2 if the site on which a pathogen is to give birth is already occupied. Hence, at any given
time, Model 2 has at least as many types as Model S2, and each type in Model 2 has at least as
many individuals as the corresponding type in Model S2.

According to Theorem 1.2, if $\lambda' < 1$ then the pathogens in Model 2 die out with probability
1. Using the coupling above, one sees that the same is true for Model S2 if $\lambda < 1/(2d)$. □

Proof of part 2 of Theorem 3. As noted before, when $r = 1$ the process of occupied sites is a
contact process for Models S2 and S3. Since $\lambda < \lambda_c$ the pathogens die out. We may couple, site
by site, Model S3 with $r < 1$ to Model S3 with $r = 1$. Deaths occur simultaneously in both
models at the same rate 1, but every time there is a death in the process with $r = 1$, a single
pathogen dies while for the model with $r < 1$ all pathogens of the same type die. Birth rates of
pathogens are the same. It is easy to see that, with this coupling, the model with $r < 1$ has fewer
occupied sites than the model with $r = 1$, at all times. Since the pathogens die out for Model S3
with $r = 1$ they also must die out for S3 with $r < 1$. □

Proof of part 1 of Theorem 4. We write the proof in dimension $d = 2$. The same ideas work in
any $d \ge 1$. We start by defining two space-time regions:

$$\mathcal{A} = [-2L, 2L]^2 \times [0, 2T], \qquad \mathcal{B} = [-L, L]^2 \times [T, 2T].$$

Note that $\mathcal{B}$ is nested in $\mathcal{A}$. Let $\mathcal{C}$ be the boundary of $\mathcal{A}$:

$$\mathcal{C} = \{(m, n, t) \in \mathcal{A} : |m| = 2L \text{ or } |n| = 2L \text{ or } t = 0\}.$$

We define the model restricted to $\mathcal{A} + (kL, mL, nT)$ as the model with the same birth and
death rates as the process on $\mathbb{Z}^2$ with the restriction that a pathogen in the complement of
$[-2L, 2L]^2 + (kL, mL)$ cannot give birth inside $[-2L, 2L]^2 + (kL, mL)$ between times $nT$ and
$(n + 2)T$.

We will compare our Models S2 and S3 to a percolation process on $\mathbb{Z}^2 \times \mathbb{Z}_+$. We declare
$(k, m, n)$ in $\mathbb{Z}^2 \times \mathbb{Z}_+$ to be wet if for the process restricted to $\mathcal{A} + (kL, mL, nT)$ there is no
pathogen in $\mathcal{B} + (kL, mL, nT)$. Moreover, we want no pathogen in $\mathcal{B} + (kL, mL, nT)$ for any
possible configuration of the boundary $\mathcal{C} + (kL, mL, nT)$.

Let $\varepsilon > 0$. We are going to show that given $\lambda > 0$, there exists $r_1$ in $(0,1)$ such that

$$P((k, m, n) \text{ is wet}) \ge 1 - \varepsilon \quad \text{if } r < r_1.$$

Consider first Models S2 and S3 restricted to $\mathcal{A}$ and with $r = 0$. That is, a pathogen born inside
$\mathcal{A}$ is always of the same type as its parent. Note that if there is a pathogen inside $\mathcal{B}$ there must
be a line of infection from the boundary $\mathcal{C}$ of $\mathcal{A}$ to $\mathcal{B}$. This line of infection has either started at
the bottom of $\mathcal{C}$ (i.e. at time 0) or on one of its sides (i.e. at a time different from 0). We will now
show that these possibilities all have exponentially small probability for both Models S2 and S3.

Since the line of infection cannot change type inside $\mathcal{A}$, if there is a line of infection from the
bottom of $\mathcal{C}$ to $\mathcal{B}$ then the type of the pathogens making up the line of infection must last at
least $T$. The death rate of a type is 1 for Model S2 and is at least 1 for Model S3. Hence, the
probability that there is a line of infection from a given site at the bottom of $\mathcal{C}$ to $\mathcal{B}$ is less
than $e^{-T}$. Note that there are $(4L + 1)^2$ sites at the bottom of $\mathcal{C}$.

We now deal with a line of infection from a side of $\mathcal{C}$. The minimum distance between a side
of $\mathcal{C}$ and $\mathcal{B}$ is $L$. Starting from a site $x$ on a side of $\mathcal{C}$ there are positive constants $c$, $C$ and $\gamma$
depending on $\lambda$ and $d$ such that the probability that a line of infection starting at $x$ reaches $\mathcal{B}$ by
time $cL$ is less than $Ce^{-\gamma L}$ (see for instance Lemma 9, p. 16 of Durrett (1988)). This estimate
takes into account only births and so is valid for Models S2 and S3. If the line of infection takes
at least $cL$ units of time to get to $\mathcal{B}$ then the type of the infection line from the side of $\mathcal{C}$ must
last at least $cL$. For both models this has a probability less than $e^{-cL}$. Putting together these
estimates we get

$$P((0,0,0) \text{ is wet}) \ge 1 - (4L + 1)^2 e^{-T} - 8T(2L + 1)Ce^{-\gamma L} - 8T(2L + 1)e^{-cL} \quad \text{for } r = 0.$$

By taking $L = T$ large enough we get

$$P((0,0,0) \text{ is wet}) \ge 1 - \varepsilon/2 \quad \text{for } r = 0.$$

Given that $\mathcal{A}$ is a finite box it is possible to find $r_1 > 0$ (depending on $\lambda$ and $\varepsilon$) so close to 0 that
if $r < r_1$ there is no creation of a new type in $\mathcal{A}$ with probability at least $1 - \varepsilon/2$. Thus, with
probability at least $1 - \varepsilon/2$, Models S2 and S3 restricted to $\mathcal{A}$ and with $r < r_1$ are coupled with
Models S2 and S3 with $r = 0$, respectively. Hence,

$$P((0,0,0) \text{ is wet}) \ge 1 - \varepsilon \quad \text{for } r < r_1.$$

By translation invariance, the same is true for any site $(k, m, n)$ of $\mathbb{Z}^2 \times \mathbb{Z}_+$. We now define a
percolation process on $\mathbb{Z}^2 \times \mathbb{Z}_+$ with finite range dependence. Let

$$\mathcal{A}(k, m, n) = (kL, mL, nT) + \mathcal{A}.$$

For each element $(k, m, n)$ in $\mathbb{Z}^2 \times \mathbb{Z}_+$ we draw an oriented edge from $(k, m, n)$ to $(x, y, z)$ if
$n < z$ and $\mathcal{A}(k, m, n) \cap \mathcal{A}(x, y, z) \ne \emptyset$. The wet sites in the ensuing directed graph constitute
a percolation model. The dependence of this percolation model has finite range because the
event that $(k, m, n)$ is wet depends only on the Poisson processes inside $\mathcal{A}(k, m, n)$ and this box
intersects only finitely many other boxes $\mathcal{A}(x, y, z)$.

Note that if there is a pathogen somewhere at some time then there must be a path of dry
sites in the percolation process. Since a dry (i.e. not wet) site has probability at most $\varepsilon$ in this
percolation process, by taking $\varepsilon > 0$ small enough one can make the probability of a path of dry
sites of length $n$ decrease exponentially fast with $n$. This in turn implies that for any given site
there will be no pathogen after a finite random time, see (8.2) in van den Berg et al. (1998). □

Remark. This proof breaks down for Model S1 for at least two reasons. In the proof above it is
crucial to have a lower bound on the death rate of a type. For Model S1 there is no such lower
bound; the death rate of a type is $1/k$ if there are $k$ types and there is no upper bound on $k$.
Moreover, what happens inside a finite space-time box for Model S1 depends on how many types
there are in the whole space. Hence, there is little hope to compare Model S1 to a finite range
percolation model as we did for Models S2 and S3.

Proof of part 2 of Theorem 4. Let $e_1$ be the vector $(1, 0, \dots, 0)$ in $\mathbb{Z}^d$, and define

$$B = [-2L, 2L]^d \times [0, T], \qquad B_{m,n} = (4mLe_1, 50nT) + B,$$

$$I = [-J, J]^d,$$

$$\mathcal{L} = \{(m, n) \in \mathbb{Z}^2 : m + n \text{ is even}\}.$$

We declare $(m, n) \in \mathcal{L}$ to be wet if there is $(x, t)$ in $B_{m,n}$ such that each site of the interval $x + I$
is occupied by a pathogen at time $t$ for the process restricted to $(4Lme_1, 50nT) + (-6L, 6L)^d \times
[0, 51T]$.

Set $r = 1$ in the box $(-6L, 6L)^d \times [0, 51T]$. As noted before, the set of occupied sites is a
contact process for Models S2 and S3. Since $\lambda > \lambda_c$ it is a supercritical contact process.

Bezuidenhout and Grimmett (1990) have shown that for a supercritical contact process, and
for any $\varepsilon > 0$, $J$, $L$ and $T$ can be chosen so that if $(0,0)$ is wet then with probability $1 - \varepsilon$, $(1,1)$
and $(-1,1)$ will also be wet. Here we are following the approach and notation of Durrett (1991).
More precisely, for any $\varepsilon > 0$ we can pick $J$, $L$ and $T$ such that

$$P((1,1) \text{ and } (-1,1) \text{ are wet} \mid (0,0) \text{ is wet}) \ge 1 - \varepsilon \quad \text{for } r = 1.$$

Since $(-6L, 6L)^d \times [0, 51T]$ is a finite space-time box, we can pick $r_2$ so close to 1 (but strictly
smaller) that for Models S2 and S3 with parameters $\lambda$ and $r > r_2$ all births inside $(-6L, 6L)^d \times
[0, 51T]$ are of a new type with probability at least $1 - \varepsilon$. Therefore, the process of occupied sites
for Models S2 and S3 and $r > r_2$ may be coupled to a contact process with probability at least
$1 - \varepsilon$. Hence, for Models S2 and S3 we have

$$P((1,1) \text{ and } (-1,1) \text{ are wet} \mid (0,0) \text{ is wet}) \ge 1 - 2\varepsilon \quad \text{for } r > r_2.$$

By picking $\varepsilon > 0$ small enough we can show that there is a positive probability of an infinite
wet cluster in $\mathcal{L}$. This, in turn, implies that pathogens have a positive probability of surviving
forever, see Durrett (1991) for more details. □


4 Analysis of the first spatial model 

In this section, we consider Model S1 and prove Theorem 2, which says that the pathogens have
positive probability of surviving whenever $\lambda > 0$ and $r > 0$. This result is easiest to prove in
$d \ge 2$, when there are always many neighboring sites on which the pathogens can give birth. We
begin by proving the result in this case.

Proof of Theorem 2 for $d \ge 2$. Assume that at some time there are $n$ different types of pathogens
in Model S1. Thus, there are $k \ge n$ occupied sites. It is easy to see that, if $d \ge 2$, at least $\sqrt{k}$
occupied sites have at least one empty neighbor. Therefore, the rate at which the number of
types goes from $n$ to $n + 1$ is at least $\lambda r \sqrt{k} \ge \lambda r \sqrt{n}$. On the other hand the rate at which the
number of types goes from $n$ to $n - 1$ is 1. Hence, the number of types in Model S1 is at least as
large as a birth and death chain with transition rates:

$n \to n + 1$ at rate $\lambda r \sqrt{n}$

$n \to n - 1$ at rate 1

An argument very similar to the one in the proof of Theorem 1.1 shows that for all $\lambda > 0$
and $r > 0$, there is a positive probability that this chain never reaches zero. Since there are at
least as many types in Model S1 as there are individuals in the birth and death chain, there is
a positive probability that the number of types in S1 does not reach zero. This completes the
proof of Theorem 2 for $d \ge 2$. □


We devote the rest of this section to the case $d = 1$. This case is more complicated because if
there are $n$ occupied sites, there could be as few as two sites with an empty neighbor. Nevertheless,
we will be able to show that the number of occupied sites grows linearly in time because the sites
on the far left and the far right of the configuration will give birth at rate $\lambda$, while deaths create
“holes” in the configuration that usually fill up quickly.

We begin by introducing some notation. Let $N(t)$ be the number of pathogens alive at time
$t$, and let $N_k(t)$ be the number of type $k$ pathogens alive at time $t$. Let $S_k$ be the set of sites
that are occupied by a type $k$ pathogen at some time. It is easy to see that $S_k$ is an interval. Let
$\zeta_k$ be the time at which the type $k$ pathogens die, with the convention that $\zeta_k = \infty$ if no type $k$
pathogen ever dies. Define

$$L(t) = \inf\{x : \text{there is a pathogen at site } x \text{ at time } t\},$$

$$R(t) = \sup\{x : \text{there is a pathogen at site } x \text{ at time } t\},$$

so all the pathogens at time $t$ are contained in the interval $[L(t), R(t)]$. Fix a positive integer $T$.
Let $D(t)$ be the number of types that die before time $t$, and let $X(t)$ be the number of times
$s$ with $T < s \le t$ such that, at time $s$, either the type occupying the site $L(s-)$ or the type
occupying the site $R(s-)$ dies at time $s$. Let $Y(t)$ be the number of times $s \le t$ such that,
at time $s$, either the pathogen at site $L(s-)$ gives birth on site $L(s-) - 1$ or the pathogen on
site $R(s-)$ gives birth on site $R(s-) + 1$. Let $\zeta = \inf\{t : N(t) = 0\}$ be the time at which the
pathogens die out, with the convention that $\zeta = \infty$ if $N(t) > 0$ for all $t$. Let $\gamma = \min\{1, \lambda/6\}$,
and let $\kappa = \inf\{t : N(t) < \gamma t\}$.
Assume that the initial configuration consists of $6T$ pathogens, all of different types labeled
$1, \dots, 6T$, with a pathogen at each of the sites $\{-3T + 1, \dots, 3T\}$. Fix positive constants $C_1$,
$C_2$, and $C_3$, and define the following six events:

• Let $A_1$ be the event that $D(t) \le 2t$ for all $t \ge T$.

• Let $A_2$ be the event that $Y(t) \le 3\lambda t$ for all $t \ge T$ and $Y(t) \ge Y(T) + \lambda(t - T)$ for all
$2T \le t < \zeta$.

• Let $A_3$ be the event that for all $t \ge T$, we have $\max_{0 \le s \le t} \max_k N_k(s) \le C_1 \log t$.

• Let $A_4$ be the event that for all $t \ge 2T$, at most $C_2 \log t$ different types die between times
$t - C_3 \log t$ and $t$.

• Let $A_5$ be the event that for all $2T \le t < \kappa$, we have $X(t) \le t^{1/2}$.

• Let $A_6$ be the event that for all $2T \le t < \kappa$, all $k \in \mathbb{N}$ such that $\zeta_k \le t - C_3 \log t$, and all
$x \in S_k$, there exists a time $s$ with $\zeta_k \le s \le t$ such that either $x < L(s)$, $x > R(s)$, or the
site $x$ is occupied at time $s$.

Proposition 5. Let $\varepsilon > 0$. Then there exist positive constants $C_1$, $C_2$, and $C_3$ such that

$$P\Big(\bigcap_{i=1}^{6} A_i\Big) \ge 1 - 15\varepsilon$$

for sufficiently large $T$.

Before proving this proposition, we show that it implies Theorem 2 for $d = 1$. The idea is that,
on $A_2$, the length of the interval between the left-most pathogen and the right-most pathogen
increases linearly. On $A_3$, at most $C_1 \log t$ pathogens can die at time $t$, and on $A_1 \cap A_4$, deaths
are sufficiently infrequent. On $A_6$, when a type dies creating a “hole” in the configuration, it fills
up quickly, so that the interval between the left-most and right-most pathogens is mostly filled
by pathogens. This interval can get shorter when the left-most or right-most pathogen is killed,
but on $A_5$ such deaths occur infrequently.

Proof of Theorem 2 for $d = 1$. If the process starts from a single pathogen at time zero, then with
positive probability we eventually reach the configuration described above, with $6T$ pathogens
of different types on sites $\{-3T + 1, \dots, 3T\}$. Therefore, it suffices to show that if the process
starts from a configuration with $6T$ pathogens of different types on $\{-3T + 1, \dots, 3T\}$, then
with positive probability the process survives forever. We will show that for sufficiently large $T$,
we have $N(t) \ge \gamma t$ for all $t$ on the event $\bigcap_{i=1}^{6} A_i$. Theorem 2 for $d = 1$ will then follow from
Proposition 5.

On $A_1$, we have $D(2T) \le 4T$, so at least $2T$ of the $6T$ pathogens alive at time zero must
survive until time $2T$. Therefore, for all $t \le 2T$, we have $N(t) \ge 2T \ge t \ge \gamma t$.

Now assume $2T \le t < \kappa$. Suppose $L(t) < x < R(t)$ and there is no pathogen at $x$ at time
$t$. Because $L(t) < x < R(t)$, the site $x$ must have been occupied at some time before $t$. Let
$u = \sup\{s < t : \text{site } x \text{ is occupied at time } s\}$. Suppose $u < t - C_3 \log t$. Then on $A_6$, there is
a time $s \in (u, t)$ such that either $x$ was occupied at time $s$, $x < L(s)$, or $x > R(s)$. However,
since $L(t) < x < R(t)$, it follows that in the latter two cases, $x$ must be occupied at some time
in $(s, t)$, which contradicts the definition of $u$. Thus, on $A_6$, we have $u \ge t - C_3 \log t$. It follows
that the number of vacant sites $x$ such that $L(t) < x < R(t)$ is at most the number of pathogens
that die between times $t - C_3 \log t$ and $t$ which, on the event $A_3 \cap A_4$, is at most $C_1 C_2 (\log t)^2$.
This means that

$$N(t) \ge R(t) - L(t) - C_1 C_2 (\log t)^2. \qquad (1)$$

We now compare $Y(t)$ to $R(t) - L(t)$. There are $Y(t) - Y(T)$ times at which the process
$(R(s) - L(s), T \le s \le t)$ increases by one. There are at most $X(t)$ times when the process
decreases because of deaths. Suppose, at time $s \in (T, t]$, the pathogen at $L(s-)$ or $R(s-)$ dies.
Note that $(R(s-) - L(s-)) - (R(s) - L(s))$ is at most the number of pathogens that die at time
$s$ plus the number of vacant sites between $L(s-)$ and $R(s-)$. The number of pathogens that die
is at most $C_1 \log t$ on $A_3$, and we just showed that the number of vacant sites between $L(s-)$ and
$R(s-)$ is at most $C_1 C_2 (\log t)^2$ on $A_3 \cap A_4 \cap A_6$. It follows that for $2T < t < \kappa$, we have

$$R(t) - L(t) \ge (R(T) - L(T)) + (Y(t) - Y(T)) - X(t)\big(C_1 \log t + C_1 C_2 (\log t)^2\big). \tag{2}$$

Now $X(t) \le t^{1/2}$ on $A_5$. On $A_2$, we have $Y(t) - Y(T) \ge \lambda(t - T) \ge \lambda t/2 \ge 3\gamma t$. Also,
$R(T) - L(T) \ge 0$. Therefore, by combining (1) with (2), we see that on $A_1 \cap \dots \cap A_6$,

$$N(t) \ge 3\gamma t - t^{1/2}\big(C_1 \log t + C_1 C_2 (\log t)^2\big) - C_1 C_2 (\log t)^2.$$

It follows that for sufficiently large $T$, we have $N(t) \ge 2\gamma t$ whenever $2T < t < \kappa$.

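To show that $N(t) \ge \gamma t$ for all $t$ on $A_1 \cap \dots \cap A_6$ for sufficiently large $T$, it remains to show that
$\kappa = \infty$ on $A_1 \cap \dots \cap A_6$ for sufficiently large $T$. Because $N(t) \ge \gamma t$ for $t \le 2T$ on $A_1$, we have $\kappa \ge 2T$.
Suppose $\kappa < \infty$. On $A_3$, we have $N(\kappa-) - N(\kappa) \le C_1 \log \kappa$ because at most $C_1 \log \kappa$ pathogens
can die at time $\kappa$. However, on $A_1 \cap \dots \cap A_6$ we have $N(\kappa-) \ge 2\gamma\kappa$ and $N(\kappa) \le \gamma\kappa$, so for $T$ large
enough that $C_1 \log(2T) < 2\gamma T$, we must have $\kappa = \infty$. □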

The proposition will follow from Lemmas 6, 8, 10, 11, and 12 below. Once $T$ is chosen sufficiently
large, Lemma 6 will imply $P(A_1) > 1 - \epsilon$ and $P(A_2) > 1 - \epsilon$. Lemma 8 then gives $P(A_3) > 1 - 3\epsilon$,
and Lemma 10 gives $P(A_4) > 1 - \epsilon$. Finally, Lemma 11 implies $P(A_5) > 1 - 4\epsilon$ and it follows
from Lemma 12 that $P(A_6) > 1 - 5\epsilon$. Our first step will be to bound the probabilities of $A_1$ and
$A_2$.

Lemma 6. Let $\epsilon > 0$. For sufficiently large $T$, we have $P(A_1) > 1 - \epsilon$ and $P(A_2) > 1 - \epsilon$.

Proof. Until time $\zeta$, deaths occur at times of a rate 1 Poisson point process. Let $D'(t)$ denote
the number of points of a rate one Poisson process before time $t$, which can be coupled with the
death process in such a way that $D(t) = D'(t)$ for all $t < \zeta$. We have $t^{-1} D'(t) \to 1$ a.s. It follows
that $P(A_1) > 1 - \epsilon$ for sufficiently large $T$.

Likewise, until time $\zeta$, the pathogens at sites $L(t)$ and $R(t)$ each give birth on sites $L(t) - 1$
and $R(t) + 1$ respectively at rate $\lambda$, so these births occur at times of a Poisson point process of
rate $2\lambda$. Let $Y'(t)$ denote the number of points of a rate $2\lambda$ Poisson point process up to time $t$,
coupled with the particle system so that $Y(t) = Y'(t)$ for $t < \zeta$. Then $t^{-1} Y'(t) \to 2\lambda$ a.s. and
$(t - T)^{-1}(Y'(t) - Y'(T)) \to 2\lambda$ a.s. It follows that $P(A_2) > 1 - \epsilon$ for sufficiently large $T$. □

We next work towards bounding the probability of $A_3$. The first step is to bound the
probability that the number of pathogens of a given type is high.




Lemma 7. Let $N_k(t)$ be the number of pathogens of type $k$ at time $t$. Then there exist positive
constants $C_4$ and $C_5$ such that for all $0 < r < 1$ and all $a$, we have

$$P\Big(\max_t N_k(t) > a\Big) \le C_4 e^{-C_5 a}.$$


Proof. It is clear from the description of the model that at any time t at which there are pathogens 
of type k, the set of sites occupied by type k pathogens is an interval of the form $\{a_t, a_t + 1, \dots, b_t\}$.
The maximum number of type k pathogens at any time can therefore be written as 1 + Y + Z, 
where Y is the number of times that the type k pathogen on the far left of the interval gives birth 
on the site to its left, and Z is the number of times that the type k pathogen on the far right of 
the interval gives birth on the site to its right. 

Let $Z_1$ be the number of times that the type $k$ pathogen at site $b_t$ gives birth to another type
$k$ pathogen on site $b_t + 1$, until the first time that the site $b_t + 1$ is occupied by a pathogen of
another type. Because each pathogen born is a new type with probability $r$, the distribution
of $Z_1 + 1$ is dominated by the geometric distribution with parameter $r$. Once a different type
occupies site $b_t + 1$, the type $k$ pathogen at $b_t$ can not give birth again at site $b_t + 1$ unless the
type at $b_t + 1$ dies before type $k$ dies, which happens with probability 1/2. It follows that the
distribution of $Z$ is dominated by the distribution of $Z_1 + \dots + Z_N$, where $N$ has the geometric
distribution with parameter 1/2, $Z_i + 1$ has the geometric distribution with parameter $r$ for all
$i$, and $N$ is independent of $Z_1, Z_2, \dots$. Therefore,

$$P\Big(Z > \frac{a-1}{2}\Big) \le \sum_{n=1}^{\infty} P(N = n)\, P\Big(Z_i > \frac{a-1}{2n} \text{ for some } i \in \{1, \dots, n\}\Big)
\le \sum_{n=1}^{\infty} \frac{n}{2^n} (1-r)^{(a-1)/2n} \le (1-r)^{-1/2} \sum_{n=1}^{\infty} \frac{n}{2^n} (1-r)^{a/2n}. \tag{3}$$


In the sum on the right-hand side of (3), the ratio of the $(n+1)$st term to the $n$th term converges
to 1/2 as $n \to \infty$, and therefore is less than 3/4 for all $n \ge M$ for some integer $M$. For this $M$,
we have

$$P\Big(Z > \frac{a-1}{2}\Big) \le (1-r)^{-1/2} \left[ \frac{M}{2} (1-r)^{a/2M} + \frac{M}{2^M} (1-r)^{a/2M} \cdot \frac{1}{1 - 3/4} \right] = \frac{C_4}{2} e^{-C_5 a},$$

where $C_4 = (1-r)^{-1/2} M (1 + 2^{3-M})$ and $C_5 = -\log(1-r)/2M$. By the same argument, we get
$P(Y > (a-1)/2) \le (C_4/2) e^{-C_5 a}$. Since the maximum number of type $k$ pathogens is $1 + Y + Z$,
the result follows. □
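The domination step above can be explored numerically. The following Python sketch is only an illustration and not part of the proof: it samples the dominating variable $Z_1 + \dots + Z_N$ and compares its empirical tail with the series $\sum_n n 2^{-n} (1-r)^{x/n}$ that appears in (3) when $x = (a-1)/2$. The function names, the seed, the sample size, the truncation level `n_max`, and the value r = 0.5 are all assumptions chosen for the example.

```python
import random

def sample_dominating_sum(r, rng):
    """Sample Z_1 + ... + Z_N, where N is geometric(1/2) on {1, 2, ...} and
    each Z_i + 1 is geometric(r) on {1, 2, ...}, all independent."""
    n = 1
    while rng.random() >= 0.5:    # one more summand with probability 1/2
        n += 1
    total = 0
    for _ in range(n):
        while rng.random() >= r:  # count failures before the first success
            total += 1
    return total

def empirical_tail(x, r, samples=20000, seed=0):
    """Monte Carlo estimate of P(Z_1 + ... + Z_N > x)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(samples) if sample_dominating_sum(r, rng) > x)
    return hits / samples

def union_bound(x, r, n_max=200):
    """Series sum_{n>=1} n 2^{-n} (1-r)^{x/n}, truncated at n_max terms."""
    return sum(n * 2.0 ** (-n) * (1.0 - r) ** (x / n) for n in range(1, n_max + 1))

if __name__ == "__main__":
    for x in (5, 10, 20):
        print(x, empirical_tail(x, 0.5), union_bound(x, 0.5))
```

In runs of this kind the empirical tail should sit well below the series, which reflects that the union bound over $i \in \{1, \dots, n\}$ is far from tight.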


Lemma 8. Let $\epsilon > 0$. There is a constant $C_1$ such that for sufficiently large $T$, we have
$P(A_3^c \cap A_1 \cap A_2) < \epsilon$.

Proof. Suppose $t \ge T$. On $A_2$, there is a set of at most $3\lambda t + 6T \le 3(\lambda + 2)t$ sites at which there
has been a pathogen at some time $s \le t$. Since there are at most $2t$ deaths before time $t$ on $A_1$,
no site can be occupied by more than $2t + 1$ different pathogens before time $t$. Therefore, on
$A_1 \cap A_2$, at most $3(\lambda + 2)(2t + 1)t$ different types of pathogens can be born by time $t$.

For positive integers $n$, let $t_n = T^{2^n}$. Given a constant $C_1$, let $B_n$ be the event that for some
$k \le 3(\lambda + 2)(2t_n + 1)t_n$, we have $N_k(s) > \frac{1}{2} C_1 \log t_n$ for some $s$. On the event $A_3^c \cap A_1 \cap A_2$, there is
a $t \ge T$ and a $k \le 3(\lambda + 2)(2t + 1)t$ such that $N_k(s) > C_1 \log t$ for some $s$. If $n = \inf\{m : t_m \ge t\}$,
then $k \le 3(\lambda + 2)(2t_n + 1)t_n$ and $N_k(s) > C_1 \log t > C_1 \log t_{n-1} = \frac{1}{2} C_1 \log t_n$. Therefore, on
$A_3^c \cap A_1 \cap A_2$, the event $B_n$ occurs for some $n$. By Lemma 7 we have

$$P(B_n) \le 3(\lambda + 2)(2t_n + 1)t_n \cdot C_4 e^{-C_1 C_5 (\log t_n)/2} = 3C_4(\lambda + 2)(2t_n + 1) t_n^{1 - C_1 C_5/2}.$$

Therefore,

$$P(A_3^c \cap A_1 \cap A_2) \le 3C_4(\lambda + 2) \sum_{n=1}^{\infty} (2t_n + 1) t_n^{1 - C_1 C_5/2}.$$

If we choose $C_1$ large enough that $2 - C_1 C_5/2 < 0$, then this expression is less than $\epsilon$ for sufficiently
large $T$. □

The next two results bound the probabilities of $A_4$ and $A_5$. Both of these events pertain to
the number of deaths. The proofs make use of the fact that, up to time $\zeta$, deaths occur at times
of a rate one Poisson process. We first state a lemma related to the gamma distribution, which
we can use to choose the constant $C_3$, now that $C_1$ has already been chosen. The reason for this
choice will become clear later. We will then choose $C_2$ in Lemma 10.

Lemma 9. There exists a constant $C_3$ such that if $X$ has a gamma distribution with shape
parameter $2C_1 \log(t + C_3 \log t)$ and scale parameter $2\lambda$, then $P(X > C_3 \log t) \le t^{-2}$ for sufficiently
large $t$.

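Proof. If $0 < \theta < 2\lambda$, then $E[e^{\theta X}] = [2\lambda/(2\lambda - \theta)]^{2C_1 \log(t + C_3 \log t)}$. It follows from Markov's
Inequality, taking $\theta = \lambda$, that

$$P(X > C_3 \log t) \le e^{-\lambda C_3 \log t} E[e^{\lambda X}] = e^{-\lambda C_3 \log t}\, 2^{2C_1 \log(t + C_3 \log t)}.$$

Let $g(t) = \log(t + C_3 \log t)/(\log t)$. Then

$$P(X > C_3 \log t) \le t^{-\lambda C_3 + 2C_1 g(t)}.$$

We can choose $C_3$ such that $\lambda C_3 - 4C_1 > 2$ and then $t$ large enough that $g(t) < 2$. The lemma
follows. □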
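As a sanity check on the chain of inequalities in this proof, the following Python sketch evaluates the Markov bound with $\theta = \lambda$, the intermediate power of $t$, and the target $t^{-2}$ for one hypothetical choice of constants. The values lam = 1, C1 = 1, C3 = 7 (so that $\lambda C_3 - 4C_1 = 3 > 2$) and the function names are assumptions made for illustration; the moment generating function used is the one stated in the proof, which treats $2\lambda$ as the rate of the gamma distribution.

```python
import math

def g(t, C3):
    """g(t) = log(t + C3 log t) / log t, as defined in the proof."""
    return math.log(t + C3 * math.log(t)) / math.log(t)

def markov_bound(t, lam, C1, C3):
    """exp(-lam*C3*log t) * 2^(2*C1*log(t + C3*log t)): Markov with theta = lam."""
    shape = 2.0 * C1 * math.log(t + C3 * math.log(t))
    return math.exp(-lam * C3 * math.log(t)) * 2.0 ** shape

def power_bound(t, lam, C1, C3):
    """t^(-lam*C3 + 2*C1*g(t)), the simplified form of the bound."""
    return t ** (-lam * C3 + 2.0 * C1 * g(t, C3))

if __name__ == "__main__":
    lam, C1, C3 = 1.0, 1.0, 7.0   # hypothetical constants
    for t in (4.0, 10.0, 100.0, 1.0e4):
        print(t, g(t, C3), markov_bound(t, lam, C1, C3),
              power_bound(t, lam, C1, C3), t ** -2)
```

With these constants, $g(t) < 2$ already holds at $t = 4$, so "sufficiently large $t$" is a mild requirement here; other constants would shift that threshold.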

Lemma 10. Let $\epsilon > 0$. There is a constant $C_2$ such that $P(A_4) > 1 - \epsilon$ for sufficiently large $T$.

Proof. For integers $n \ge 1$ and $k \ge 0$ such that $2^n T + C_3(k - 1)\log(2^{n+1}T) \le 2^{n+1}T$, define the
interval

$$I_{n,k} = [2^n T + C_3(k - 1)\log(2^{n+1}T),\ 2^n T + C_3 k \log(2^{n+1}T)].$$

Let $B_{n,k}$ be the event that at least $\frac{1}{2} C_2 \log(2^n T)$ types die during the time interval $I_{n,k}$. For any
$t$ such that $2^n T \le t \le 2^{n+1}T$, the interval from $t - C_3 \log t$ to $t$ is contained in $I_{n,k} \cup I_{n,k+1}$ for
some $k$. Therefore, if for some $t$ such that $2^n T \le t \le 2^{n+1}T$, more than $C_2 \log t$ types die between
times $t - C_3 \log t$ and $t$, the event $B_{n,k}$ must occur for some $k$. It follows that

$$P(A_4^c) \le \sum_{n=1}^{\infty} \sum_k P(B_{n,k}),$$

so we need to bound the probabilities $P(B_{n,k})$.

Until time $\zeta$, types die at times of a rate one Poisson process, so the distribution of the
number of types that die during the interval $I_{n,k}$ is dominated by the Poisson distribution with
mean $C_3 \log(2^{n+1}T)$. If $X$ has the Poisson distribution with mean $\lambda$, then for all $\theta > 0$, we have

$$P(X \ge a\lambda) \le e^{-\theta a \lambda} E[e^{\theta X}] = e^{-\theta a \lambda + \lambda(e^{\theta} - 1)}.$$

Choosing $\theta = \log a$, we get

$$P(X \ge a\lambda) \le e^{-\lambda(a \log a - a + 1)}. \tag{4}$$

To bound $P(B_{n,k})$, we need to apply (4) with $\lambda = C_3 \log(2^{n+1}T)$ and $a\lambda = \frac{1}{2} C_2 \log(2^n T)$. This
means that $a = (C_2 \log(2^n T))/(2C_3 \log(2^{n+1}T))$. We can choose $C_2$ large enough that, for all $n$,
we have $b = C_3(a \log a - a + 1) > 1$. For this choice of $C_2$, we get

$$P(B_{n,k}) \le (2^{n+1}T)^{-b}.$$

For sufficiently large $T$, we have that for all $n$ there are at most $2^{n+1}T$ intervals $I_{n,k}$. For such
$T$,

$$P(A_4^c) \le \sum_{n=1}^{\infty} (2^{n+1}T)^{1-b},$$

which is less than $\epsilon$ for sufficiently large $T$. □
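The Poisson tail bound (4) used in this proof is easy to check numerically. The Python sketch below compares a directly summed Poisson tail with $e^{-\lambda(a \log a - a + 1)}$ for a few arbitrary values of the mean and of $a > 1$. The function names, the test values, and the truncation of the tail sum after `extra_terms` terms are assumptions made for the example; the truncation makes the computed tail a very slight underestimate.

```python
import math

def poisson_tail(lam, m, extra_terms=500):
    """P(X >= m) for X ~ Poisson(lam), summing the pmf from m upward.
    The sum is truncated after extra_terms terms."""
    # pmf at m, computed in log space to avoid overflow in m!
    term = math.exp(-lam + m * math.log(lam) - math.lgamma(m + 1))
    total = 0.0
    for j in range(m, m + extra_terms):
        total += term
        term *= lam / (j + 1)   # pmf(j + 1) = pmf(j) * lam / (j + 1)
    return total

def chernoff_bound(lam, a):
    """Right-hand side of (4): exp(-lam * (a log a - a + 1)), for a > 1."""
    return math.exp(-lam * (a * math.log(a) - a + 1.0))

def compare(lam, a):
    """Return (tail, bound) for the event {X >= a * lam}."""
    m = math.ceil(a * lam)      # X is integer valued
    return poisson_tail(lam, m), chernoff_bound(lam, a)

if __name__ == "__main__":
    for lam in (0.5, 2.0, 10.0):
        for a in (1.5, 2.0, 4.0):
            print(lam, a, *compare(lam, a))
```

The exponent $a \log a - a + 1$ is positive for every $a > 1$ and grows with $a$, which is why taking $C_2$ large forces $b > 1$ in the proof.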

Lemma 11. Let $\epsilon > 0$. For sufficiently large $T$, we have $P(A_5^c \cap A_3) < \epsilon$.

Proof. Until time $\zeta$, deaths occur at times of a rate one Poisson process. Denote the times of
such a Poisson process by $0 < \tau_1 < \tau_2 < \dots$. Define a sequence $(U_i)_{i=1}^{\infty}$ of independent random variables
each having a uniform distribution on [0,1]. When a death event occurs, one type is
chosen at random to die. Therefore, denoting the number of types at time $t$ by $M(t)$, we may
assume that until time $\zeta$, deaths occur at the times $\tau_1 < \tau_2 < \dots$ and that, at time $\tau_i$, if
$M(\tau_i-) \ge 2$ then either the type at $L(\tau_i-)$ or $R(\tau_i-)$ dies if and only if $U_i \le 2/M(\tau_i-)$.

Suppose $T < t < \kappa$. On the event $A_3$, the number of pathogens of a given type before time
$t$ is at most $C_1 \log t$, so the number of types is at least $N(t)/(C_1 \log t) \ge \gamma t/(C_1 \log t)$. It follows
that $X(t)$ is at most the number of times $\tau_i$ such that $T < \tau_i \le t$ and $U_i \le (2C_1 \log \tau_i)/(\gamma \tau_i)$.
Such times $\tau_i$ occur at times of an inhomogeneous Poisson process of rate $\lambda(s) = (2C_1 \log s)/(\gamma s)$.
It follows that $P(A_5^c \cap A_3)$ is at most the probability that, for some $t$, such a Poisson process
contains at least $t^{1/2}$ points between times $T$ and $t$.

For positive integers $n$, let $t_n = 4^n T$. Let $B_n$ be the event that there are at least $\frac{1}{2} t_n^{1/2}$ points of
the Poisson process between times $T$ and $t_n$. If there is a $t$ such that there are at least $t^{1/2}$ points
between times $T$ and $t$, then if $n = \min\{m : 4^m T \ge t\}$, there are at least $t^{1/2} \ge (4^{n-1}T)^{1/2} = \frac{1}{2} t_n^{1/2}$
points between times $T$ and $t_n$, so $B_n$ occurs. It follows that $P(A_5^c \cap A_3) \le \sum_{n=1}^{\infty} P(B_n)$.

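To bound $P(B_n)$, first note that for $T > 1$, the distribution of the number of points of the
Poisson process between times $T$ and $t_n$ is Poisson with mean

$$\int_T^{t_n} \frac{2C_1 \log s}{\gamma s}\, ds \le \frac{2C_1 \log t_n}{\gamma} \int_1^{t_n} \frac{1}{s}\, ds = \frac{2C_1 (\log t_n)^2}{\gamma}.$$

Thus, $P(B_n)$ is at most the right-hand side of (4) when $\lambda = 2\gamma^{-1} C_1 (\log t_n)^2$ and $a\lambda = \frac{1}{2} t_n^{1/2}$. It
follows easily that $\sum_{n=1}^{\infty} P(B_n) < \epsilon$ for sufficiently large $T$, which completes the proof. □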
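The bound on the Poisson mean in this proof can be verified directly, since the integrand $2C_1 \log s/(\gamma s)$ has antiderivative $(C_1/\gamma)(\log s)^2$. The Python sketch below compares that closed form with a numerical integral and with the upper bound $2C_1(\log t_n)^2/\gamma$. The function names and the constants T = 10, C1 = 1, gamma = 0.1 are hypothetical values picked for illustration.

```python
import math

def mean_exact(T, t_n, C1, gamma):
    """Closed form of the integral of 2*C1*log(s)/(gamma*s) over [T, t_n]."""
    return (C1 / gamma) * (math.log(t_n) ** 2 - math.log(T) ** 2)

def mean_numeric(T, t_n, C1, gamma, steps=100000):
    """Midpoint rule after the substitution s = exp(v), so that ds/s = dv."""
    lo, hi = math.log(T), math.log(t_n)
    h = (hi - lo) / steps
    return sum(2.0 * C1 * (lo + (i + 0.5) * h) / gamma * h for i in range(steps))

def mean_upper(t_n, C1, gamma):
    """Upper bound used in the proof: 2*C1*(log t_n)^2 / gamma."""
    return 2.0 * C1 * math.log(t_n) ** 2 / gamma

if __name__ == "__main__":
    T, C1, gamma = 10.0, 1.0, 0.1   # hypothetical constants
    for n in (1, 2, 3, 6):
        t_n = 4 ** n * T
        print(n, mean_exact(T, t_n, C1, gamma),
              mean_numeric(T, t_n, C1, gamma), mean_upper(t_n, C1, gamma))
```

The mean grows only like $(\log t_n)^2$ while the threshold $\frac{1}{2} t_n^{1/2}$ grows polynomially, which is what makes the probabilities $P(B_n)$ summable.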


It remains to bound $P(A_6)$. Informally, $A_6$ is the event that whenever a type dies, creating
a “hole” in the configuration, the hole fills up quickly.

Suppose a type dies at time $t$. The pathogens that died occupied some interval $[l_t, r_t]$. If the
sites $l_t - 1$ and $r_t + 1$ are occupied at time $t$, then we say that a hole is created at time $t$. If $s > t$,
we say that the hole exists until time $s$ if there is a $\mathbb{Z}$-valued process $(H(u), t \le u \le s)$ such that:

• $H(t) \in [l_t, r_t]$.

• The site $H(u)$ is empty for all $u \in [t, s]$.

• $H(u-) - 1 \le H(u) \le H(u-) + 1$ for all $u \in (t, s]$.

• $L(u) < H(u) < R(u)$ for all $u \in [t, s]$.

For each $u \in [t, s]$, we think of $H(u)$ as being a site in the hole that was created at time $t$. Over
time, this site may move around within the hole so that no pathogen is born on it. Note that it
need not be the case that $H(u) \in [l_t, r_t]$ for all $u \in [t, s]$. For example, if the pathogen occupying
site $r_t + 1$ dies, the hole can exist beyond the time at which pathogens are born on all of the sites
in $[l_t, r_t]$ if the site $r_t + 1$ remains vacant. If no such process $(H(u), t \le u \le s)$ exists, then we
say the hole disappears by time $s$.

Lemma 12. Let $\epsilon > 0$. For sufficiently large $T$, we have $P(A_6^c \cap A_1 \cap A_3) < \epsilon$.

Proof. We call a hole long-lasting if it is created at time $t \le 2T - C_3 \log(2T)$ and exists until time
$t + C_3 \log(2T)$, or if it is created at time $t > 2T - C_3 \log(2T)$ and exists until time $t + C_3 \log t$.
Assume $T$ is large enough that $2T - C_3 \log(2T) > T$ and that the function $t \mapsto t - C_3 \log t$ is
increasing on $[T, \infty)$. If $2T < t < \kappa$ and there are no long-lasting holes created before time
$\kappa - C_3 \log \kappa$, then every hole created before time $t - C_3 \log t$ disappears by time

$$\max\{2T,\ t - C_3 \log t + C_3 \log(t - C_3 \log t)\} \le t.$$

It follows that if there are no long-lasting holes created before time $\kappa - C_3 \log \kappa$, then $A_6$ must
occur. This is because if $A_6$ does not occur, then there exist $t \in (2T, \kappa)$, $\zeta_k < t - C_3 \log t$, and
$x \in S_k$ such that for all $s \in [\zeta_k, t]$, the site $x$ is vacant at time $s$ and $L(s) < x < R(s)$. Therefore,
if a new hole is created at time $\zeta_k$, then by taking $H(s) = x$ for all $s \in [\zeta_k, t]$, we see that the
hole exists until time $t$, contradicting that there are no long-lasting holes. If a new hole was not
created at time $\zeta_k$, then a hole created at an even earlier time lasts until time $t$, which gives the
same contradiction. It thus remains to bound the probability that there is such a long-lasting
hole.

Suppose a hole is created at time $t$. The pathogens that died at time $t$ occupied some interval
$[l_t, r_t]$. Label “a” the type occupying $l_t - 1$ at time $t$, and label “b” the type occupying $r_t + 1$ at
time $t$. Label “c” the type at site $\max\{x < l_t : \text{there is not a pathogen of type } a \text{ at site } x\}$, if
this site is occupied. Likewise, label “d” the type at the site $\min\{x > r_t : \text{there is not a pathogen
of type } b \text{ at site } x\}$, if this site is occupied.

Suppose $t \le 2T - C_3 \log(2T)$, and that no holes created at earlier times are long-lasting. As
long as a hole exists, pathogens are giving birth at rate $\lambda$ on the sites on the endpoints of the
hole. Therefore, the size of the hole decreases by one at times of a rate $2\lambda$ Poisson process, until
the hole no longer exists. By Lemma 9, the probability that fewer than $2C_1 \log(2T)$ points of
this Poisson process occur by time $C_3 \log(2T)$ is at most $(2T)^{-2}$ for sufficiently large $T$. On $A_3$,
before time $2T$ there can be no more than $C_1 \log(2T)$ pathogens of a given type. Therefore, if
more than $2C_1 \log(2T)$ points of the Poisson process occur by time $C_3 \log(2T)$, the hole will not
exist at time $t + C_3 \log(2T)$ unless two of the types a, b, c, and d die before time $t + C_3 \log(2T)$.
This is clear if all four types exist. If, for example, type c does not exist, then if type a dies
at time $t^*$, either the hole will not exist beyond time $t^*$ because $L(t^*)$ will be to the right of
where the hole was previously, or the hole will merge with a hole born before time $t$, which by
assumption is not long-lasting and therefore will not still exist at time $t + C_3 \log(2T)$. However,
before time $2T$, there are always at least $2T$ types on $A_1$, so the rate at which one of these four
types is dying is at most $4/(2T)$. Using that when $X$ has a Poisson distribution with parameter
$\lambda$, we have $P(X \ge 2) \le \lambda^2$, we see that the probability that two of the four types die by time
$C_3 \log(2T)$ is at most $[4C_3 (\log 2T)/2T]^2$, and therefore the probability that the hole created at
time $t$ is long-lasting is at most $[(4C_3 (\log 2T) + 1)/2T]^2$.

Suppose instead $2T - C_3 \log(2T) < t < \kappa - C_3 \log \kappa$, and that no holes created at earlier times
are long-lasting. On $A_3$, before time $t + C_3 \log t$ there can be no more than $C_1 \log(t + C_3 \log t)$
pathogens of a given type. Therefore, the hole can be long-lasting only if either two of the types
a, b, c, and d die before time $t + C_3 \log t$, or if there are fewer than $2C_1 \log(t + C_3 \log t)$ points of
a rate $2\lambda$ Poisson process (whose points correspond to births at the endpoints of the hole) before
time $C_3 \log t$. Lemma 9 implies that the probability of the latter is at most $t^{-2}$ for sufficiently
large $T$. Since $t + C_3 \log t < \kappa$, the number of types during the interval from $t$ to $t + C_3 \log t$ is
always at least $\gamma t/(C_1 \log(t + C_3 \log t))$. Therefore, the rate of deaths of the four types is at most
$4C_1 \log(t + C_3 \log t)/\gamma t$, so the probability of at least two deaths during this time interval is at
most $[4C_1 C_3 (\log t) \log(t + C_3 \log t)/\gamma t]^2$. Therefore, the probability that the hole created at time
$t$ is long-lasting is at most $[(4C_1 C_3 (\log t) \log(t + C_3 \log t) + \gamma)/\gamma t]^2$.

Since deaths occur at rate 1, the bounds in these two time intervals imply that

$$P(A_6^c \cap A_1 \cap A_3) \le \int_0^{2T - C_3 \log(2T)} \left( \frac{4C_3 \log(2T) + 1}{2T} \right)^2 dt
+ \int_{2T - C_3 \log(2T)}^{\infty} \left( \frac{4C_1 C_3 (\log t) \log(t + C_3 \log t) + \gamma}{\gamma t} \right)^2 dt,$$

which is less than $\epsilon$ for sufficiently large $T$. □
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